1 Introduction



Recent advances in pattern classification have enabled the development of sophisticated software
systems that can recognize natural language input such as speech [6] or handwriting [7]. These
applications allow users to communicate with the system in a natural and convenient way and
permit the automation of tasks that previously required human input. Some examples of such
applications include interactive voice response (IVR) systems, automated cheque-processing
systems, and automated form data-entry systems.

In addition, the growth of networked computing and the Internet has enabled the development of 
complex distributed systems, and the existence of open, standardized protocols has allowed the 
integration of end-user devices, centralized servers, and applications. An example of a three-tiered 
distributed system architecture is depicted in Figure 1. The combination of distributed computing 
and pattern recognition techniques has made possible the development of systems such as 
Netpage [1], an interactive paper-based interface to online information. Systems such as these
give users the ability to interact with their information from any location that provides network
connectivity (including wireless network access) using familiar human-communication techniques
such as handwriting or speech.

This paper discusses some of the technical issues involved with integrating pattern recognition
into a distributed environment, and proposes a generic framework for
distributed recognition using centralized recognition servers and distributed context processing.
Also discussed are techniques for managing the user-specific customisation and adaptation that is
required to make pattern recognition systems accurate and flexible.



1.1 Cross-References



Various methods, systems and apparatus relating to the present invention are disclosed in the
following co-pending applications filed by the applicant or assignee of the present invention. The
disclosures of all of these co-pending applications are incorporated herein by cross-reference.

5 October 2002:

Australian Provisional Application 2002952259, "Methods and Apparatus (NPT019)".

15 October 2002:

PCT/AU02/01395, PCT/AU02/01392, PCT/AU02/01393, PCT/AU02/01394 and

26 November 2001:

PCT/AU01/01531, PCT/AU01/01528, PCT/AU01/01529, PCT/AU01/01530 and

11 October 2001:

PCT/AU01/01274.

14 August 2001:

PCT/AU01/00996.

27 November 2000:

PCT/AU00/01442, PCT/AU00/01444, PCT/AU00/01446, PCT/AU00/01445, PCT/AU00/01450,
PCT/AU00/01453, PCT/AU00/01448, PCT/AU00/01447, PCT/AU00/01459, PCT/AU00/01451,
PCT/AU00/01454, PCT/AU00/01452, PCT/AU00/01443, PCT/AU00/01455, PCT/AU00/01456,
PCT/AU00/01457, PCT/AU00/01458 and PCT/AU00/01449.

20 October 2000:

PCT/AU00/01273, PCT/AU00/01279, PCT/AU00/01288, PCT/AU00/01282, PCT/AU00/01276,
PCT/AU00/01280, PCT/AU00/01274, PCT/AU00/01289, PCT/AU00/01275, PCT/AU00/01277,
PCT/AU00/01286, PCT/AU00/01281, PCT/AU00/01278, PCT/AU00/01287, PCT/AU00/01285,
PCT/AU00/01284 and PCT/AU00/01283.

15 September 2000: 

PCT/AU00/01108, PCT/AU00/01110 and PCT/AU00/01111.

30 June 2000:

[illegible], PCT/AU00/007[illegible], PCT/AU00/00761, PCT/AU00/00760, PCT/AU00/00759,
PCT/AU00/00758, PCT/AU00/00764, PCT/AU00/00765, PCT/AU00/00766, PCT/AU00/00767,
PCT/AU00/00768, PCT/AU00/00773, PCT/AU00/00774, PCT/AU00/00775, PCT/AU00/00776,
PCT/AU00/00777, PCT/AU00/00770, PCT/AU00/00769, PCT/AU00/00771, [illegible],
PCT/AU00/00754, PCT/AU00/00755, PCT/AU00/00756 and PCT/AU00/00757.

24 May 2000:

PCT/AU00/00518, PCT/AU00/00519, PCT/AU00/00520, PCT/AU00/00521, PCT/AU00/00522,
PCT/AU00/00523, PCT/AU00/00524, PCT/AU00/00525, PCT/AU00/00526, PCT/AU00/00527,
PCT/AU00/00528, PCT/AU00/00529, PCT/AU00/00530, PCT/AU00/00531, PCT/AU00/00532,
PCT/AU00/00533, PCT/AU00/00534, PCT/AU00/00535, PCT/AU00/00536, PCT/AU00/00537,
PCT/AU00/00538, PCT/AU00/00539, PCT/AU00/00540, PCT/AU00/00541, PCT/AU00/00542,
[illegible],
PCT/AU00/00554, PCT/AU00/00556, PCT/AU00/00557, PCT/AU00/00558, PCT/AU00/00559,
PCT/AU00/00560, PCT/AU00/00561, PCT/AU00/00562, PCT/AU00/005[illegible], [illegible],
PCT/AU00/00565, PCT/AU00/00566, PCT/AU00/00567, PCT/AU00/0056[illegible], [illegible],
PCT/AU00/00571, PCT/AU00/00572, PCT/AU00/00573, [illegible],
PCT/AU00/00575, PCT/AU00/00576, PCT/AU00/00577, PCT/AU00/00578, PCT/AU00/00579,
PCT/AU00/00581, PCT/AU00/00580, PCT/AU00/00582, PCT/AU00/00587, PCT/AU00/00588,
[illegible], PCT/AU00/00583, PCT/AU00/00593, [illegible],
PCT/AU00/00592, PCT/AU00/00594, PCT/AU00/00595, PCT/AU00/00596, PCT/AU00/00597,
PCT/AU00/00598, PCT/AU00/00516, PCT/AU00/00517 and PCT/AU00/005[illegible].

1.2 Pattern Recognition

The basic processing steps of a pattern recognition system are depicted in Figure 3. Processing
begins when an input device generates a signal that is to be recognized by the system (that is, to
be classified as belonging to a specific class or sequence of class elements). Usually, one or more
pre-processing procedures are applied to remove noise and normalize the signal, which is then
segmented to produce a stream of primitive elements required for the classification procedure.
Note that often this segmentation is "soft", meaning that a number of potential segmentation
points are located, and the final segmentation points are resolved during classification or context
processing.

The segmented signal is then passed to a classifier where a representative set of features is
extracted from the signal and used in combination with a pre-defined model of the input signal to
produce a set of symbol hypotheses. These hypotheses give an indication of the probability that a
sequence of segments within the signal represents a basic symbolic element (e.g. letter, word,
phoneme etc.). After classification, the context-processing module uses the symbol hypotheses
generated by the classifier to decode the signal according to a specified context model (such as a
dictionary or character grammar). The result produced by the context processing is passed to the
application for interpretation and processing.
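
The processing steps described above can be sketched as a chain of functions. The following Python sketch is illustrative only: the function names, the toy two-symbol classifier, the fixed-size ("hard") segmentation, and the word-list context model are all assumptions introduced here and are not part of the described system.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Hypotheses = Dict[str, float]  # symbol -> probability for one segment


def preprocess(signal: Sequence[float]) -> List[float]:
    """Stand-in for noise removal and normalization: shift to zero mean."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def segment(signal: Sequence[float], size: int = 2) -> List[List[float]]:
    """Split the signal into fixed-size primitive elements."""
    return [list(signal[i:i + size]) for i in range(0, len(signal), size)]


def classify(element: Sequence[float]) -> Hypotheses:
    """Toy classifier (hypothetical): positive elements resemble 'a', negative 'b'."""
    if sum(element) >= 0:
        return {"a": 0.9, "b": 0.1}
    return {"a": 0.2, "b": 0.8}


def decode(hypotheses: List[Hypotheses],
           words: Sequence[str]) -> Tuple[Optional[str], float]:
    """Context processing: pick the best-scoring word from the context model."""
    best_word, best_score = None, 0.0
    for word in words:
        if len(word) != len(hypotheses):
            continue
        score = 1.0
        for symbol, hyp in zip(word, hypotheses):
            score *= hyp.get(symbol, 0.0)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


def recognize(signal: Sequence[float],
              words: Sequence[str]) -> Tuple[Optional[str], float]:
    """Run pre-processing, segmentation, classification, and context decoding."""
    elements = segment(preprocess(signal))
    return decode([classify(e) for e in elements], words)


word, score = recognize([3.0, 3.0, -3.0, -3.0], ["ab", "ba", "bb"])
```

In this sketch the result handed to the application is the pair (word, score); a real system would pass richer intermediate results, as discussed in Section 2.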

1.3 Context Processing



Natural language input is inconsistent, noisy, and ambiguous, leading to potential recognition and
decoding errors. However, high recognition accuracy is required for pattern recognition
applications to operate successfully, since mistakes can be expensive and frustrating to users. As a
result, recognition systems must make use of as much contextual information as possible to
increase the likelihood of correctly recognizing the input. For example, when recognizing a signal
that must represent a country name, the recognition system can use a pre-defined list of valid
country names to guide the recognition procedure. Similarly, when recognizing a phone number, a
limited symbol set (i.e. digits) can be used to constrain the recognition results.
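
As an illustration of constraining results with a limited symbol set, the following Python sketch filters the hypotheses for a single character against the digits allowed in a phone-number field. The function names, the renormalization step, and the example scores are assumptions made for illustration, not details taken from the described system.

```python
from typing import Dict, Iterable, Optional

DIGITS = set("0123456789")  # limited symbol set for a phone-number field


def constrain(hypotheses: Dict[str, float],
              allowed: Iterable[str]) -> Dict[str, float]:
    """Drop symbols outside the allowed set and renormalize the remaining scores."""
    allowed_set = set(allowed)
    kept = {s: p for s, p in hypotheses.items() if s in allowed_set}
    total = sum(kept.values())
    if total == 0.0:
        return {}  # nothing in the hypotheses is valid in this context
    return {s: p / total for s, p in kept.items()}


def best_symbol(hypotheses: Dict[str, float]) -> Optional[str]:
    """Return the highest-scoring symbol, or None if there are no hypotheses."""
    return max(hypotheses, key=hypotheses.get) if hypotheses else None


# Hypothetical classifier output for one handwritten character:
# the letter 'O' and the digit '0' are easily confused.
raw = {"O": 0.5, "0": 0.4, "Q": 0.1}
unconstrained = best_symbol(raw)
constrained = best_symbol(constrain(raw, DIGITS))
```

Without the constraint the letter 'O' would be chosen; with the digit constraint only '0' remains as a valid result.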

The problem domain for many pattern recognition systems is inherently ambiguous (i.e. many of
the input patterns encountered during processing cannot be accurately classified without further
information from a different source). As an example, one of the major issues faced in the
development of highly accurate handwriting recognition systems is the inherent ambiguity of
handwriting (e.g. the letters 'u' and 'v', 't' and 'f', and 'g' and 'y' are often written with a very
similar appearance and are thus easily confused). Human readers rely on contextual knowledge to
correctly decode handwritten text, and as a result a large amount of research has been directed at
applying syntactic and linguistic constraints to handwritten text recognition [8, 9, 10, 11, 12, 13].

[illegible] to the field of [illegible]



1.4 Current Systems 

ParaGraph offers a network-based distributed handwriting recognition system called NetCalif(TM)
that is based on their Calligrapher handwriting recognition software. "The user's natural
handwriting - cursive, print, or a combination of both - is captured by compact client software,
then transmitted from the Internet-connected device to the NetCalif servers where it is converted
and returned as typewritten text to the device" [2].

SpeechMagic is a client/server-based, professional speech recognition software
package [3]. This system supports specialized vocabularies (called ConTexts) and "dictation,
[illegible] correction can be done independently [illegible] location, across a LAN, WAN, or

1.5 Detailed Description of the Preferred Embodiments 

In the preferred embodiment, the invention is configured to work with the Netpage networked
computer system, a detailed description of which is given in our co-pending applications,
including in particular PCT application WO0242989 entitled "Sensing Device" filed 30 May
2002, PCT application WO0242894 entitled "Interactive Printer" filed 30 May 2002, PCT
application WO0214075 entitled "Interface Surface Printer Using Invisible Ink" filed 21 February 2002,
PCT application WO0242950 entitled "Apparatus For Interaction With A Network Computer System"
filed 30 May 2002, and PCT application WO03034276 entitled "Digital Ink Database Searching
Using Handwriting Feature Synthesis" filed 24 April 2003. It will be appreciated that not every
implementation will necessarily embody all or even most of the specific details and extensions
described in these applications in relation to the basic system. However, the system is described in
its most complete form to assist in understanding the context in which the preferred embodiments
and aspects of the present invention operate.

In brief summary, the preferred form of the Netpage system provides an interactive paper-based
interface to online information by utilizing pages of invisibly coded paper and an optically
imaging pen. Each page generated by the Netpage system is uniquely identified and stored on a
network server, and all user interaction with the paper using the Netpage pen is captured,
interpreted, and stored. Digital printing technology facilitates the on-demand printing of Netpage
documents, allowing interactive applications to be developed. The Netpage printer, pen, and
network infrastructure provide a paper-based alternative to traditional screen-based applications.

Typically, a printer receives a document from a publisher or application provider via a broadband
connection, which is printed with an invisible pattern of infrared tags that each encodes the
location of the tag on the page and a unique page identifier. As a user writes on the page, the
imaging pen decodes these tags and converts the motion of the pen into digital ink. The digital ink
is transmitted over a wireless channel to a relay base station, and then sent to the network for
processing and storage. The system uses a stored description of the page to interpret the digital
ink, and performs the requested actions by interacting with an application.

Applications provide content to the user by publishing documents, and process the digital ink
interactions submitted by the user. Typically, an application generates one or more interactive
pages in response to user input, which are transmitted to the network to be stored, rendered, and
finally printed as output to the user. The Netpage system allows sophisticated applications to be
developed by providing services for document publishing, rendering, and delivery, authenticated 
transactions and secure payments, handwriting recognition and digital ink searching, and user 
validation using biometric techniques such as signature verification. 



2 Distributed Pattern Recognition 

An example architecture for a distributed pattern recognition system is depicted in Figure 9.
[illegible] to use a mechanism for distributed recognition as depicted in Figure 2 [illegible]



2.1 Symbol DAG 



[illegible] derived from the input signal [illegible] structure, in combination with a context
model, to decode the input signal. [illegible] matrix represent [illegible] following symbol
[illegible]. These paths and associated classification scores [illegible] of the pattern
classifier [illegible] can be combined with a context model to fully decode the input signal.

Note that the symbol DAG is applicable in any pattern recognition task where a sequence of
classification results is decoded using a context or set of constraints. The symbols contained in the 
symbol DAG may be any primitive element that is generated as the output of a pattern classifier, 
including the output from a time-series classifier. Examples of such recognition systems include 
handwriting and speech recognition, protein sequencing [14], image processing and computer 
vision [15], and econometrics [16]. 

2.2 Symbol DAG Example 

As an example, Table 1 shows a symbol DAG that represents the output from a handwritten 
character recognizer generated by the ambiguous text given in Figure 4. In this example, the 
recognizer has found two possible character segmentation arrangements, as depicted by the two 
rows in the symbol DAG. Note that in the examples, the symbol scores are given as probabilities; 
however, an actual implementation will typically use log-probabilities (i.e. the base-10 logarithm 
of the probability result) to improve the performance of context processing and to avoid overflow 
and underflow problems that occur when multiplying probabilities using finite precision floating- 
point operations. 
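
The underflow problem mentioned above can be demonstrated directly. The following Python sketch (the function names and the example path of 100 symbols are assumptions chosen for illustration) compares multiplying probabilities with summing their base-10 logarithms.

```python
import math
from typing import Sequence


def product_score(probabilities: Sequence[float]) -> float:
    """Multiply probabilities directly; long paths can underflow to 0.0."""
    score = 1.0
    for p in probabilities:
        score *= p
    return score


def log_score(probabilities: Sequence[float]) -> float:
    """Sum base-10 log-probabilities; the result stays well within range."""
    return sum(math.log10(p) for p in probabilities)


# A long decoding path of 100 symbols, each with probability 1e-5.
long_path = [1e-5] * 100
direct = product_score(long_path)   # true value 1e-500 is not representable
logged = log_score(long_path)       # approximately -500

# For a short path the two forms agree: 10 ** log_score equals the product.
short_path = [0.7, 0.8, 1.0, 1.0]
recovered = 10 ** log_score(short_path)
```

With double-precision floating point the direct product of the long path collapses to zero, so all such paths would become indistinguishable, whereas the log-domain total remains usable for comparison.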

To decode the alternatives, the context processor starts with the first entry in the DAG (i.e. the
character 'c'). The score for this entry is added to the accumulated total (since log-probabilities
are added rather than multiplied), and processing moves to the column given by the offset value in
the entry (in this example, column 1). In column 1, two alternatives exist (i.e. "cl" or "cb"), and
the scores for these alternatives are found by adding the scores to the previous total. The decoding
continues until the end of the DAG is reached. Similarly, the second entry in column 0 (i.e. the
character 'd') is decoded; note however, that column 1 is skipped in this traversal of the DAG, as
indicated by the offset value of 2 in the character score entry. This is due to the letter 'd' being
constructed using two strokes, and thus the recognition of the letters 'l' and 'b' cannot be valid in
this alternative. Thus, the potential decoding alternatives in this example are:

clog = 0.7 * 0.8 * 1.0 * 1.0 = 0.56
cbg  = 0.7 * 0.2 * 1.0 = 0.14

dog  = 0.3 * 1.0 * 1.0 = 0.30

These values can now be combined with a language model or other contextual information to 
select the most likely word. 
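
The traversal described above can be sketched in Python as follows. The column layout, including the offset of 2 assigned to 'b', is an assumption reconstructed from the decoding alternatives listed above rather than a copy of Table 1, and the names Entry, DAG, and decode_paths are hypothetical.

```python
import math
from typing import List, Tuple

Entry = Tuple[str, float, int]  # (symbol, probability, offset to next column)

# Assumed layout: 'd' and 'b' each span two columns, so their offset is 2.
DAG: List[List[Entry]] = [
    [("c", 0.7, 1), ("d", 0.3, 2)],  # column 0
    [("l", 0.8, 1), ("b", 0.2, 2)],  # column 1
    [("o", 1.0, 1)],                 # column 2
    [("g", 1.0, 1)],                 # column 3
]


def decode_paths(dag: List[List[Entry]], column: int = 0, text: str = "",
                 log_total: float = 0.0) -> List[Tuple[str, float]]:
    """Enumerate every path through the DAG, accumulating log-probabilities."""
    if column >= len(dag):
        # End of the DAG reached: convert the log total back to a probability.
        return [(text, 10 ** log_total)]
    results: List[Tuple[str, float]] = []
    for symbol, probability, offset in dag[column]:
        results.extend(decode_paths(dag, column + offset, text + symbol,
                                    log_total + math.log10(probability)))
    return results


alternatives = decode_paths(DAG)
for candidate, probability in alternatives:
    print(f"{candidate} = {probability:.2f}")
```

Exhaustive enumeration is used here only because the example is tiny; a practical context processor would normally prune paths using the context model as it goes.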

Table 1. Example DAG for "clog"/"dog" Ambiguity



The DAG structure must ensure that strokes are assigned to an individual letter only once. To do 
this, alternate paths must be defined to ensure that if a stroke is assigned to a letter, no subsequent 
letter may use that stroke in its construction. An example of this is given in Figure 5, with the 
derived DAG depicted in Table 2. In this example, the short, horizontal marks can potentially be 
recognized as crossbar elements of a letter 't', or diacritical marks for the letter 'i'. However, if a
marking is used as a crossbar, it cannot subsequently be used as a diacritical. The potential 
decoding alternatives in this example are: 



tile = 0.6 * 1.0 * 0.6 * 1.0 = 0.36
tlte = 0.6 * 1.0 * 1.0 * 1.0 = 0.60
lite = 0.4 * 1.0 * 1.0 * 1.0 = 0.40

These values can now be combined with a language model to select the most likely word.

Table 2. Example DAG for "lite"/"tile" Ambiguity

[illegible] of a DAG [illegible] set to zero [illegible] a NUL character [illegible] does not
change the text, but will modify the text probability). This allows [illegible] to be modeled
as a SPACE/NUL pair, indicating that there is a certain probability that a space appears at that
point in the DAG. For example:

Table 3. Example DAG for SPACE/NUL Pair



The potential decoding alternatives in this example are:

ab  = 1.0 * 0.6 * 1.0 = 0.6
a b = 1.0 * 0.4 * 1.0 = 0.4
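
A SPACE/NUL pair can be handled by the same kind of traversal if the NUL entry contributes a score but no text. In the following Python sketch the DAG layout is an assumption reconstructed from the two alternatives above (the contents of Table 3 are not reproduced here), and representing NUL as an empty string is a choice made for this illustration.

```python
from typing import Dict, List, Tuple

NUL = ""      # contributes a probability but no text
SPACE = " "

Entry = Tuple[str, float, int]  # (symbol, probability, offset to next column)

# Assumed layout: a SPACE/NUL pair sits between the letters 'a' and 'b'.
dag: List[List[Entry]] = [
    [("a", 1.0, 1)],
    [(NUL, 0.6, 1), (SPACE, 0.4, 1)],
    [("b", 1.0, 1)],
]


def decode(dag: List[List[Entry]]) -> Dict[str, float]:
    """Return every decoded text with its probability (plain products, as in the text)."""
    pending = [(0, "", 1.0)]  # (column, text so far, probability so far)
    finished: Dict[str, float] = {}
    while pending:
        column, text, probability = pending.pop()
        if column >= len(dag):
            finished[text] = finished.get(text, 0.0) + probability
            continue
        for symbol, score, offset in dag[column]:
            pending.append((column + offset, text + symbol, probability * score))
    return finished


alternatives = decode(dag)
```

Here the decoded texts are "ab" and "a b", matching the two alternatives listed above.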



2.3 Distributed Recognizer Management

[illegible] the system, since the recognition manager acts as a controller for the set of recognizers.

2.4 User-Specific Dictionaries

Distributed recognition systems can also support user dictionaries, which are user-specific word 
lists (and possibly associated a-priori probabilities) that include words that a user writes 
frequently but which are unlikely to appear in a standard dictionary (examples include company 
names, work or personal interest specific terms, etc.). User dictionaries can be stored and 
managed centrally so that words added to the dictionary when using one application are available 
to all applications for context processing. Obviously, applications can manage and use their own 
local user-specific dictionaries if required, since they have full control over context decoding. 

When an application requires the recognition of a signal that may contain words found in the user 
dictionary (e.g. standard handwritten text input such as the subject line of an e-mail or an arbitrary 
voice message), the centralized recognition system generates the usual intermediate recognition 
results to be returned to the application for context decoding. However, in addition to this it 
decodes the intermediate results using the user-dictionary as a language model, the result of which 
is also returned to the application. These two intermediate results structures can be combined by 
the application during its context decoding to generate a final decoding that includes the user- 
specific dictionary information. 
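
The text does not specify how an application combines the two sets of results, so the following Python sketch shows only one possible approach under stated assumptions: each decoding is reduced to a mapping from candidate word to score, the better score is kept for each word, and a hypothetical user_weight parameter lets the application scale the user-dictionary scores.

```python
from typing import Dict, List, Tuple


def combine_decodings(app_candidates: Dict[str, float],
                      user_dict_candidates: Dict[str, float],
                      user_weight: float = 1.0) -> List[Tuple[str, float]]:
    """Merge two candidate lists, keeping the better score for each word.

    app_candidates: scores from the application's own context decoding.
    user_dict_candidates: scores from decoding with the user dictionary.
    user_weight: hypothetical factor applied to user-dictionary scores.
    """
    combined = dict(app_candidates)
    for word, score in user_dict_candidates.items():
        weighted = score * user_weight
        if weighted > combined.get(word, 0.0):
            combined[word] = weighted
    # Best candidate first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores: the standard context model does not know "Netpage".
app_result = {"nearpage": 0.30, "net page": 0.25}
user_result = {"Netpage": 0.40}
ranked = combine_decodings(app_result, user_result)
best_word = ranked[0][0]
```

Because the application retains control over context decoding, it could equally apply a different combination rule, for example weighting the user dictionary lower for fields where personal vocabulary is unlikely.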

2.5 User-Specific Training 

Distributed recognition systems may also support user-specific training for recognizers, as 
depicted in Figure 8. The data generated by a user-specific recognition training application is 
submitted to the centralized recognition manager, which stores the data in a database. The 
recognition manager then enumerates all recognizers to determine if they support the data format 
as defined by the parameters associated with the training data, and if so, submits the training data 
to the recognizer for user-specific training. 

When an existing recognizer is upgraded or a new recognizer is added to the system, the 
recognition manager queries the training database to determine if any training data of the format 
required by the recognizer exists. If so, the training data is submitted to the newly registered 
recognizer for processing, as depicted in Figure 9. 
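
The two training scenarios above can be sketched as follows. The class and method names (Recognizer, RecognitionManager, supports, train, submit_training_data, register) are hypothetical, an in-memory list stands in for the training database, and the data format is reduced to a single string for illustration.

```python
from typing import Iterable, List, Tuple

TrainingRecord = Tuple[str, str, list]  # (user id, data format, samples)


class Recognizer:
    """Minimal recognizer stub that records the training data it receives."""

    def __init__(self, name: str, formats: Iterable[str]) -> None:
        self.name = name
        self.formats = set(formats)
        self.trained: List[TrainingRecord] = []

    def supports(self, data_format: str) -> bool:
        return data_format in self.formats

    def train(self, user_id: str, data_format: str, samples: list) -> None:
        self.trained.append((user_id, data_format, samples))


class RecognitionManager:
    """Stores training data centrally and routes it to compatible recognizers."""

    def __init__(self) -> None:
        self.recognizers: List[Recognizer] = []
        self.training_db: List[TrainingRecord] = []

    def submit_training_data(self, user_id: str, data_format: str,
                             samples: list) -> None:
        # Store the data, then offer it to every recognizer that supports it.
        self.training_db.append((user_id, data_format, samples))
        for recognizer in self.recognizers:
            if recognizer.supports(data_format):
                recognizer.train(user_id, data_format, samples)

    def register(self, recognizer: Recognizer) -> None:
        # A new or upgraded recognizer receives all stored data it can use.
        self.recognizers.append(recognizer)
        for user_id, data_format, samples in self.training_db:
            if recognizer.supports(data_format):
                recognizer.train(user_id, data_format, samples)


manager = RecognitionManager()
old = Recognizer("handwriting-v1", ["ink"])
manager.register(old)
manager.submit_training_data("user-1", "ink", ["sample-a"])
manager.submit_training_data("user-1", "audio", ["sample-b"])
new = Recognizer("speech-v1", ["audio"])
manager.register(new)  # picks up the previously stored audio data
```

Keeping the training data in the manager rather than in each recognizer is what allows a recognizer added later to be trained without asking the user to repeat the training session.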



3 Conclusion 

A number of techniques that allow pattern recognition to be performed in a distributed system 
have been discussed, including a method of returning the intermediate results generated by a
pattern classifier to an application for context processing. In addition to this, techniques for 
managing multiple recognizers, user-specific dictionaries, and the user-specific training of 
recognizers have been given. 

Figure 3. Basic Pattern Recognition




Figure 4. Ambiguous Input Ink for "clog"/"dog"




Figure 5. Ambiguous Input Ink for "tile"/"lite"

Figure 7. Recognizer Selection Scenario

Figure 8. Recognizer Training Scenario

Figure 9. Recognizer Registration Scenario